Bandwidth-Efficient Collective Communication for Clustered Wide Area Systems
Authors
Abstract
Metacomputing infrastructures couple multiple clusters (or MPPs) via wide-area networks and thus allow parallel programs to run on geographically distributed resources. A major problem in programming such wide-area parallel applications is the difference in communication costs inside and between clusters. Latency and bandwidth of WANs often are orders of magnitude worse than those of local networks. Our MagPIe library eases wide-area parallel programming by providing an efficient implementation of MPI’s collective communication operations. MagPIe exploits the hierarchical structure of clustered wide-area systems and minimizes the communication overhead over the WAN links. In this paper, we present improved algorithms for collective communication that achieve shorter completion times by simultaneously using the aggregate bandwidth of the available wide-area links. Our new algorithms split messages into multiple segments that are sent in parallel over different WAN links, thus resulting in a better utilization of the (scarce) wide-area bandwidth. To determine the optimal segment size and the best communication tree shape, we introduce a new performance model that extends LogP. We present communication algorithms for several collective operations, which we implemented in the MagPIe library. An experimental performance evaluation shows that the new algorithms significantly improve the performance of several collective operations (for large messages) and that there is a close match between the theoretical model and the measured completion times for most operations. Also, we show that our approach finds segment sizes and tree shapes that are close to optimal in most cases.
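The segment-size trade-off mentioned in the abstract can be illustrated with a small sketch. This is not the paper's LogP-based model, which the abstract does not spell out. It is a generic pipelining cost model, assumed here for illustration only: each segment pays a fixed per-segment overhead plus its transmission time, and the message crosses a chain of store-and-forward hops. The function and parameter names (`pipelined_time`, `best_segment_size`, `overhead_s`, and so on) are hypothetical.

```python
import math


def pipelined_time(msg_bytes, seg_bytes, hops, latency_s, overhead_s, bytes_per_s):
    """Estimate completion time of a segmented, pipelined transfer.

    Assumed model (not taken from the paper): every segment costs a fixed
    overhead plus seg_bytes / bytes_per_s per hop, and all segments are
    treated as full-sized, which slightly overestimates the last one.
    """
    segments = math.ceil(msg_bytes / seg_bytes)
    per_segment = overhead_s + seg_bytes / bytes_per_s
    # The first segment crosses `hops` stages; each later segment
    # finishes one stage after its predecessor.
    return hops * latency_s + (segments + hops - 1) * per_segment


def best_segment_size(msg_bytes, hops, latency_s, overhead_s, bytes_per_s, candidates):
    """Return the candidate segment size with the lowest estimated time."""
    return min(
        candidates,
        key=lambda s: pipelined_time(
            msg_bytes, s, hops, latency_s, overhead_s, bytes_per_s
        ),
    )
```

Under this assumed model, small segments improve overlap between hops but pay the per-segment overhead more often, so the best size lies between the extremes whenever more than one hop is involved. With a single hop, segmentation only adds overhead and the whole message is the best "segment".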
Related resources
MPI’s Reduction Operations in Clustered Wide Area Systems
The emergence of meta computers and computational grids makes it feasible to run parallel programs on large-scale, geographically distributed computer systems. Writing parallel applications for such systems is a challenging task which may require changes to the communication structure of the applications. MPI’s collective operations (such as broadcast and reduce) allow for some of these changes...
Performance of Multi-beam Satellite Systems With A New Bandwidth Sharing Algorithm
Efficient resource allocation is important to guarantee the best performance with a fair distribution of multi-beam satellite capacity to provide satellite multimedia and broadcasting services. In this way, available bandwidth and capacity problems in new satellite systems like Multi-Input-Multi-Output (MIMO), exploring new techniques for enhancing spectral efficiency in satellite communicat...
Compact CPW-fed Circular Patch Antenna for UWB Applications
In this paper, a novel CPW-fed antenna is presented for UWB applications. The antenna mainly comprises a simple circular patch and a modified ground plane. The inclusion of two novel symmetrical rectangular slots with an inner area of 3×2.8 mm² at the top corners of the antenna creates a new path for the current and consequently leads to bandwidth enhancement. A rectangular stub is also ado...
Jointly power and bandwidth allocation for a heterogeneous satellite network
Because resources such as transmission power and bandwidth are scarce in satellite systems, resource allocation is a very important challenge. Nowadays, new heterogeneous networks include one or more satellites besides terrestrial infrastructure, and each satellite is assumed to have multiple beams to increase capacity. This type of structure is suitable for a new generation of commu...
Generating an Efficient Dynamics Multicast Tree under Grid Environment
The use of an efficient multicast tree can substantially speed up many communication-intensive MPI applications. This is even more crucial for Grid environments, since the MPI runtime has to work over wide-area networks with very different and unbalanced network bandwidth. This paper proposes a new and efficient algorithm called GADT (Genetics Algorithm based Dynamics Tree) that can be used to generate...
Journal title:
Volume, issue:
Pages: -
Publication date: 2000